18 research outputs found

    SentiBench - a benchmark comparison of state-of-the-practice sentiment analysis methods

    In the last few years, thousands of scientific papers have investigated sentiment analysis, several startups that measure opinions on real data have emerged, and a number of innovative products related to this theme have been developed. There are multiple methods for measuring sentiment, including lexical-based and supervised machine learning methods. Despite the vast interest in the theme and the wide popularity of some methods, it is unclear which one is better at identifying the polarity (i.e., positive or negative) of a message. Accordingly, there is a strong need for a thorough apples-to-apples comparison of sentiment analysis methods, as they are used in practice, across multiple datasets originating from different data sources. Such a comparison is key to understanding the potential limitations, advantages, and disadvantages of popular methods. This article aims to fill this gap by presenting a benchmark comparison of twenty-four popular sentiment analysis methods (which we call the state-of-the-practice methods). Our evaluation is based on a benchmark of eighteen labeled datasets, covering messages posted on social networks, movie and product reviews, as well as opinions and comments in news articles. Our results highlight the extent to which the prediction performance of these methods varies across datasets. To boost the development of this research area, we release the methods' code and the datasets used in this article, deploying them in a benchmark system that provides an open API for accessing and comparing sentence-level sentiment analysis methods.

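    Mídias Sociais e Administração Pública: Análise do sentimento social perante a atuação do governo federal brasileiro (Social Media and Public Administration: Analysis of social sentiment toward the performance of the Brazilian federal government)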

    This study sought to identify how sentiment analysis, based on texts extracted from social media, can serve as an instrument for measuring public opinion about government performance and thereby contribute to the evaluation of public administration. It is an applied, interdisciplinary, exploratory study that is both qualitative and quantitative. The main theoretical and conceptual formulations on the topic were reviewed, and practical demonstrations were carried out using an opinion mining tool that provided satisfactory accuracy in data processing. For demonstration purposes, the study selected topics that motivated the wave of protests involving millions of people in Brazil in June 2013. Approximately 130,000 messages about these topics, posted on Facebook and Twitter in two distinct periods, were collected, processed, and analyzed. The investigation showed that sentiment analysis can reveal citizens' polarized opinion regarding government performance.

    Dynamics of news events and social media reaction

    A relativistic opinion mining approach to detect factual or opinionated news sources

    19th International Conference on Big Data Analytics and Knowledge Discovery, DaWaK 2017; Lyon, France; 28 August 2017 through 31 August 2017. The credibility of news cannot be isolated from that of its source. Further, it is mainly associated with a news source’s trustworthiness and expertise. In an effort to measure the trustworthiness of a news source, the factor of “is factual or opinionated” must be considered among others. In this work, we propose an unsupervised probabilistic lexicon-based opinion mining approach to describe a news source as “being factual or opinionated”. We take words’ positive, negative, and objective scores from a sentiment lexicon and normalize these scores using their cumulative distribution. This statistical approach is inspired by the relativist notion that each word is evaluated by its difference from the average word. To test the effectiveness of the approach, three different kinds of news text are chosen: editorials, New York Times articles, and Reuters articles, which differ in how opinionated they are. The experimental validation is done by analysis of variance on these groups of news. The results show that our technique can distinguish the news articles from these groups with respect to “being factual or opinionated” in a statistically significant way. Supported by the Scientific and Technological Research Council of Turkey under contract number 114E78
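
The normalization step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "cumulative distribution" means the empirical CDF of a score over the whole lexicon, and the toy lexicon, the function names, and the choice to average per-word values over a document are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical toy lexicon: word -> (positive, negative, objective) scores.
TOY_LEXICON = {
    "report": (0.0, 0.0, 1.0),
    "said": (0.0, 0.0, 1.0),
    "good": (0.75, 0.0, 0.25),
    "terrible": (0.0, 0.875, 0.125),
    "claims": (0.125, 0.125, 0.75),
}


def make_ecdf(values):
    """Return the empirical CDF of `values` as a function."""
    ordered = sorted(values)
    n = len(ordered)

    def ecdf(x):
        # Fraction of lexicon entries whose score is <= x.
        return bisect_right(ordered, x) / n

    return ecdf


def objectivity(tokens, lexicon):
    """Mean CDF-normalized objective score of the tokens found in the lexicon.

    Returns None when no token is covered by the lexicon.
    """
    ecdf = make_ecdf([scores[2] for scores in lexicon.values()])
    known = [ecdf(lexicon[t][2]) for t in tokens if t in lexicon]
    return sum(known) / len(known) if known else None


factual = objectivity(["report", "said"], TOY_LEXICON)
opinionated = objectivity(["good", "terrible"], TOY_LEXICON)
print(factual, opinionated)
```

Under these assumptions, a word's normalized score states where it sits relative to the rest of the lexicon rather than its raw value, which is one way to read "evaluated by its difference from the average word".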

    Predicting Contradiction Intensity: Low, Strong or Very Strong?

    Reviews of web resources (e.g. courses, movies) are increasingly exploited in text analysis tasks (e.g. opinion detection, controversy detection). This paper investigates contradiction intensity in reviews, exploiting features such as variation of ratings and variation of polarities around specific entities (e.g. aspects, topics). First, aspects are identified according to the distributions of emotional terms in the vicinity of the most frequent nouns in the review collection. Second, the polarity of each review segment containing an aspect is estimated. Only resources containing these aspects with opposite polarities are considered. Finally, features are evaluated with feature selection algorithms to determine their impact on the effectiveness of contradiction intensity detection. The selected features are used to train several state-of-the-art learning approaches. The experiments are conducted on a Massive Open Online Courses data set containing 2244 courses and their 73,873 reviews, collected from coursera.org. Results show that variation of ratings, variation of polarities, and number of reviews are the best predictors of contradiction intensity. J48 was the most effective learning approach for this type of classification.
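
The three predictors named in this abstract can be sketched as a small feature extractor. This is an illustration only: the abstract does not say how "variation" is measured, so the population standard deviation is assumed here, and the function name, the input layout, and the signed polarity scale are hypothetical.

```python
from statistics import pstdev


def contradiction_features(reviews):
    """Compute features for one aspect of one resource.

    `reviews` is a list of (rating, polarity) pairs, where polarity is a
    signed score (assumed: > 0 positive, < 0 negative).
    Returns None when the aspect does not carry opposite polarities,
    mirroring the filtering step described in the abstract.
    """
    ratings = [rating for rating, _ in reviews]
    polarities = [polarity for _, polarity in reviews]

    has_positive = any(p > 0 for p in polarities)
    has_negative = any(p < 0 for p in polarities)
    if not (has_positive and has_negative):
        return None

    return {
        "rating_variation": pstdev(ratings),       # assumed measure of variation
        "polarity_variation": pstdev(polarities),  # assumed measure of variation
        "review_count": len(reviews),
    }


contested = contradiction_features([(5, 1.0), (1, -1.0)])
agreed = contradiction_features([(5, 0.5), (4, 0.8)])
print(contested, agreed)
```

The resulting dictionaries could then be fed to any classifier; J48, mentioned in the abstract, is the decision tree learner that the abstract reports as most effective.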

    Combining sentiment analysis scores to improve accuracy of polarity classification in MOOC posts

    Sentiment analysis is a set of techniques for identifying sentiment and emotions in written texts. This introductory work explores the combination of sentiment scores and polarities (positive, neutral, and negative) provided by different sentiment analysis tools. The goal is to generate a final score and its respective polarity by normalizing and taking the arithmetic average of the scores given by those tools that provide a minimum of reliability. The texts analyzed to test our hypotheses were forum posts by participants in a massive open online course (MOOC) offered by Universidade Aberta de Portugal. They were submitted to four online service APIs offering sentiment analysis: Amazon Comprehend, Google Natural Language, IBM Watson Natural Language Understanding, and Microsoft Text Analytics. The initial results are encouraging, suggesting that the average score is a valid way to increase the accuracy of the predictions from different sentiment analyzers.
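
The normalize-then-average idea in this abstract can be sketched as follows. The sketch makes several assumptions that the abstract does not state: each tool's score is mapped linearly onto [-1, 1], the neutral band is ±0.25, and the tool names and score ranges are placeholders rather than the actual output ranges of the four services.

```python
from statistics import mean


def normalize(score, low, high):
    """Linearly map a score from [low, high] onto [-1, 1]."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return 2.0 * (score - low) / (high - low) - 1.0


def combine(scores, ranges, neutral_band=0.25):
    """Average the normalized scores and derive a polarity label.

    `scores` maps a tool name to its raw score; `ranges` maps the same
    name to that tool's (low, high) score range. Both are hypothetical.
    """
    normalized = [normalize(s, *ranges[tool]) for tool, s in scores.items()]
    final = mean(normalized)
    if final > neutral_band:
        polarity = "positive"
    elif final < -neutral_band:
        polarity = "negative"
    else:
        polarity = "neutral"
    return final, polarity


# Placeholder tools with three different score scales.
ranges = {"tool_a": (-1.0, 1.0), "tool_b": (0.0, 1.0), "tool_c": (1.0, 5.0)}
scores = {"tool_a": 0.8, "tool_b": 0.9, "tool_c": 4.0}
final, polarity = combine(scores, ranges)
print(final, polarity)
```

Restricting the average to "tools that provide a minimum of reliability" would amount to filtering `scores` before calling `combine`; how reliability is measured is not specified in the abstract, so it is left out here.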